🖥 PDF Craft: a Python library for converting PDFs (primarily scanned books) to Markdown and EPUB, using local AI models and an LLM to structure the content. GitHub
Key features
- Text and layout extraction: uses DocLayout-YOLO combined with its own algorithms to detect and filter headings, columns, footnotes, and page numbers
- Local OCR: recognizes the text on each page via OnnxOCR, with optional GPU (CUDA) acceleration
- Reading order detection: uses LayoutReader to arrange the text in the order a human reader would follow
- Markdown conversion: generates .md files with relative links to images (illustrations, tables, formulas) stored in the assets folder
Installation and requirements
Python ≥ 3.10 (3.10.16 recommended).
Run pip install pdf-craft and pip install onnxruntime==1.21.0 (or onnxruntime-gpu==1.21.0 for CUDA).
For the EPUB pipeline, you need access to an LLM service (for example, DeepSeek).
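The requirements above can be checked with a small helper before installing. This is only a sketch: the function names python_is_supported and install_commands are hypothetical and not part of PDF Craft, and the script only prints the pip commands quoted in the post rather than running them.

```python
import sys

# Values taken from the requirements listed in the post.
MIN_PYTHON = (3, 10)
ONNXRUNTIME_VERSION = "1.21.0"


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the (major, minor) version meets the stated minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def install_commands(use_cuda: bool = False) -> list[str]:
    """Build the pip commands; hypothetical helper, not a PDF Craft API."""
    # The GPU build of ONNX Runtime is only needed for CUDA acceleration.
    runtime = "onnxruntime-gpu" if use_cuda else "onnxruntime"
    return [
        "pip install pdf-craft",
        f"pip install {runtime}=={ONNXRUNTIME_VERSION}",
    ]


if __name__ == "__main__":
    if not python_is_supported():
        print("Warning: Python 3.10 or newer is required")
    for command in install_commands(use_cuda=False):
        print(command)
```

Passing use_cuda=True swaps in the onnxruntime-gpu package while keeping the same pinned version.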
The STAR Market, as is implied by the name, is heavily geared toward smaller innovative tech companies, in particular those engaged in strategically important fields, such as biopharmaceuticals, 5G technology, semiconductors, and new energy. The STAR Market currently has 340 listed securities. The STAR Market is seen as important for China’s high-tech and emerging industries, providing a space for smaller companies to raise capital in China. This is especially significant for technology companies that may be viewed with suspicion on overseas stock exchanges.
China’s stock markets are some of the largest in the world, with total market capitalization reaching RMB 79 trillion (US$12.2 trillion) in 2020. China’s stock markets are seen as a crucial tool for driving economic growth, in particular for financing the country’s rapidly growing high-tech sectors. Although traditionally closed off to overseas investors, China’s financial markets have gradually been loosening restrictions over the past couple of decades. At the same time, reforms have sought to make it easier for Chinese companies to list on onshore stock exchanges, and new programs have been launched in attempts to lure some of China’s most coveted overseas-listed companies back to the country.